Abstract. Let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be positive integer vectors
such that $r_1 + \cdots + r_m = c_1 + \cdots + c_n$. We consider the set $\Sigma(R, C)$ of non-negative
$m \times n$ integer matrices (contingency tables) with row sums $R$ and column sums $C$
as a finite probability space with the uniform measure. We prove that a random
table $D \in \Sigma(R, C)$ is close with high probability to a particular matrix ("typical
table") $Z$ defined as follows. We let $g(x) = (x + 1) \ln(x + 1) - x \ln x$ for $x \geq 0$ and
let $g(X) = \sum_{i,j} g(x_{ij})$ for a non-negative matrix $X = (x_{ij})$. Then $g(X)$ is strictly
concave and attains its maximum on the polytope of non-negative $m \times n$ matrices
$X$ with row sums $R$ and column sums $C$ at a unique point, which we call the typical
table $Z$.

1. Introduction and the main result 

(1.1) Random contingency tables. Let $R = (r_1, \ldots, r_m)$ be a positive integer
$m$-vector and let $C = (c_1, \ldots, c_n)$ be a positive integer $n$-vector such that

$$\sum_{i=1}^{m} r_i = \sum_{j=1}^{n} c_j = N.$$

A contingency table with margins $(R, C)$ is a non-negative integer matrix $D = (d_{ij})$
with row sums $R$ and column sums $C$:

$$\sum_{j=1}^{n} d_{ij} = r_i \ \text{ for } i = 1, \ldots, m, \qquad \sum_{i=1}^{m} d_{ij} = c_j \ \text{ for } j = 1, \ldots, n,$$

and $d_{ij} \geq 0$, $d_{ij} \in \mathbb{Z}$ for all $i, j$.

Let $\Sigma(R, C)$ be the set of all contingency tables with margins $(R, C)$. As is well
known, $\Sigma(R, C)$ is non-empty and finite. Let us consider $\Sigma(R, C)$ as a finite
probability space endowed with the uniform probability measure. In this paper we
address the following question:

Suppose that $D \in \Sigma(R, C)$ is chosen at random. What is $D$ likely to look like?

The problem is interesting in its own right, but the main motivation comes
from statistics; see [Go63], [DE85], [DG95] and references therein. A contingency
table $D = (d_{ij})$ may represent certain statistical data (for example, $d_{ij}$ may be
the number of people in a certain sample having the $i$-th hair color and the $j$-th
eye color). One can condition on the row and column sums and ask what is
special about a particular table $D \in \Sigma(R, C)$, considering all tables in $\Sigma(R, C)$ as
equiprobable; see [DE85]. To answer this question we need to know what a random
table $D \in \Sigma(R, C)$ looks like. Considerable effort was invested in finding an efficient
(polynomial time) algorithm to sample a random table $D \in \Sigma(R, C)$; see [DG95],
[D+97], [C+06]. Despite a number of successes, such an algorithm is still at large
in many interesting situations. In this paper, we do not discuss how to sample a
random table but describe instead what it is likely to look like.

We prove that a random contingency table $D$ is close in a certain sense to some
particular non-negative $m \times n$ matrix $Z$, which we call the typical table.

(1.2) The typical table. Let $\mathcal{P}(R, C)$ be the set of all $m \times n$ non-negative matrices
$X = (x_{ij})$ with row sums $R$ and column sums $C$:

$$\sum_{j=1}^{n} x_{ij} = r_i \ \text{ for } i = 1, \ldots, m, \qquad \sum_{i=1}^{m} x_{ij} = c_j \ \text{ for } j = 1, \ldots, n,$$

and $x_{ij} \geq 0$ for all $i, j$.

Geometrically, $\mathcal{P}(R, C)$ is a convex polytope of dimension $(m - 1)(n - 1)$, known
as the transportation polytope. Let

$$g(x) = (x + 1) \ln(x + 1) - x \ln x \quad \text{for } x \geq 0$$

and let

$$g(X) = \sum_{i,j} g(x_{ij})$$

for a non-negative matrix $X = (x_{ij})$. One can easily check that $g$ is strictly concave
and hence achieves a unique maximum $Z = (z_{ij})$ on $\mathcal{P}(R, C)$. We call $Z$ the typical
table with margins $(R, C)$. Since the objective function $g$ is concave, $Z$ can be
computed efficiently, both in theory and in practice, by existing methods of convex
optimization, cf. [NN94].
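As a rough illustration of how $Z$ might be computed for small margins, the sketch below does not use the interior-point methods of [NN94]. Instead it assumes the parametrization $z_{ij} = 1/(e^{\lambda_i + \mu_j} - 1)$ that follows from the optimality condition (2.3.1) and alternately adjusts the $\lambda_i$ and $\mu_j$ by bisection so that the row sums and then the column sums match. The function names `bose`, `solve_shift` and `typical_table`, the starting point and the number of sweeps are all choices made here for illustration, not part of the paper.

```python
import math

def bose(x):
    # 1/(e^x - 1) for x > 0; +inf otherwise, which keeps bisection inside the domain
    return 1.0 / math.expm1(x) if x > 0 else math.inf

def solve_shift(target, others, steps=100):
    # Find lam with sum_j bose(lam + others[j]) = target.
    # The sum is decreasing in lam on lam > -min(others).
    base = -min(others)
    width = 1.0
    while sum(bose(base + width + u) for u in others) > target:
        width *= 2.0
    lo, hi = base, base + width
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if sum(bose(mid + u) for u in others) > target:
            lo = mid
        else:
            hi = mid
    return hi

def typical_table(R, C, sweeps=200):
    # Alternating one-dimensional solves (an assumed heuristic, not the method of [NN94])
    lam, mu = [0.0] * len(R), [1.0] * len(C)
    for _ in range(sweeps):
        lam = [solve_shift(r, mu) for r in R]   # match the row sums
        mu = [solve_shift(c, lam) for c in C]   # match the column sums
    return [[bose(l + u) for u in mu] for l in lam]
```

For equal margins such as `typical_table([2, 2, 2], [2, 2, 2])` the result should be the constant table with entries $2/3$, in agreement with the remark in Section 1.6 that the typical table and the independence table coincide in that case.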

The solution Z to the above optimization problem was first introduced in the 
author's paper [Ba09]. It was given the name of "typical table" (perhaps with not 
enough justification) in [B+08]. 

In this paper, we show that $Z$ indeed captures some typical features of a random
table $D \in \Sigma(R, C)$.

We prove our main result assuming certain regularity ("smoothness") of margins.

(1.3) Smooth margins. Let us fix a number $0 < \delta < 1$. First, we assume that
the row sums and column sums are of the same order:

$$\frac{\delta N}{m} \leq r_i \leq \frac{N}{\delta m} \ \text{ for } i = 1, \ldots, m \quad \text{and} \quad \frac{\delta N}{n} \leq c_j \leq \frac{N}{\delta n} \ \text{ for } j = 1, \ldots, n. \tag{1.3.1}$$

Second, we assume that the density of the table is separated from 0:

$$\frac{N}{mn} \geq \delta. \tag{1.3.2}$$

We say that the margins $(R, C)$ are $\delta$-smooth if conditions (1.3.1)-(1.3.2) are
satisfied. This is a modification of the definition from [B+08]. We note that $\delta$-smooth
margins are also $\delta'$-smooth for any $0 < \delta' < \delta$. As we remarked (see (1.3.2)), we
are interested in tables with the density separated from 0. For the case of sparse
tables, where $r_i \ll n$ and $c_j \ll m$, see [Ne69], [GM08] and references therein.
Without loss of generality, we assume that $n \geq m$.

(1.4) Definitions and notation. Let us choose a non-empty subset of entries of
a matrix:

$$S \subset \{(i, j) : 1 \leq i \leq m, \ 1 \leq j \leq n\}.$$

For an $m \times n$ matrix $A = (a_{ij})$ let

$$\sigma_S(A) = \sum_{(i,j) \in S} a_{ij}$$

be the sum of the entries from $S$.

The cardinality of a finite set X is denoted by |X|. 

Now we state our main result. 

(1.5) Theorem. Let us fix real numbers $0 < \delta < 1$ and $\kappa > 0$. Then there exists
a positive integer $q = q(\delta, \kappa)$ such that the following holds:

Suppose that $(R, C)$ are $\delta$-smooth margins such that $n \geq m > q$.

Let

$$S \subset \{(i, j) : 1 \leq i \leq m, \ 1 \leq j \leq n\}$$

be a set such that

$$|S| \geq \delta mn,$$

let $Z$ be the typical table with margins $(R, C)$, and let

$$\epsilon = \frac{\delta \ln n}{m^{1/3}}.$$

If $\epsilon \leq 1$ then

$$\Pr\{D \in \Sigma(R, C) : (1 - \epsilon)\sigma_S(Z) \leq \sigma_S(D) \leq (1 + \epsilon)\sigma_S(Z)\} \geq 1 - 2n^{-\kappa n}.$$

In other words, asymptotically, as far as the sum over a positive fraction of
entries is concerned, a contingency table $D$ sampled uniformly at random from the
set of contingency tables with given margins is very likely to be close to the typical
table $Z$.

(1.6) The independence table. In [Go63], I.J. Good observes that the independence
table

$$Y = (y_{ij}), \quad y_{ij} = r_i c_j / N \ \text{ for all } i, j$$

maximizes the entropy

$$H(X) = \sum_{i,j} \frac{x_{ij}}{N} \ln \frac{N}{x_{ij}}$$

on the set of all matrices $X = (x_{ij})$ in the transportation polytope $\mathcal{P}(R, C)$. One
may be tempted to think that the independence table $Y$, not the typical table $Z$,
reflects the structure of a random table $D \in \Sigma(R, C)$.

One can show that $Y = Z$ if and only if all row sums $r_i$ are equal or all column
sums $c_j$ are equal. In fact, particular entries of the matrices $Z$ and $Y$ may
demonstrate very different behavior even for reasonably looking margins. Suppose,
for example, that $m = n$, that $r_1 = c_1 = 3n$ and that $r_i = c_i = n$ for $i > 1$. Hence
$N = 3n + n(n - 1) = n^2 + 2n$ and for the independence table we have

$$y_{11} = \frac{9n^2}{n^2 + 2n} \leq 9.$$

On the other hand, for the typical table $Z$ the entry $z_{11}$ grows linearly in $n$. Indeed,
the optimality condition for $Z$ (the gradient of $g$ at $Z$ is orthogonal to the affine
span of the transportation polytope) implies that

$$\ln\left(\frac{z_{ij} + 1}{z_{ij}}\right) = \lambda_i + \mu_j \ \text{ for all } i, j$$

and some $\lambda_1, \ldots, \lambda_m$, $\mu_1, \ldots, \mu_n$; see Section 2.3. By symmetry, we can choose
$\lambda_1 = \mu_1 = \alpha$ and $\lambda_i = \mu_i = \beta$ for $i > 1$. Moreover, we must have $0 < \alpha < \beta$. Since

$$z_{i1} = \frac{1}{e^{\alpha + \beta} - 1} \geq \frac{1}{e^{2\beta} - 1} = z_{ij} \ \text{ for all } i, j > 1$$

and $r_2 = n$, we should have

$$\frac{1}{e^{2\beta} - 1} \leq 1, \quad \text{that is,} \quad e^{\beta} \geq \sqrt{2}.$$

Therefore,

$$z_{1i} = \frac{1}{e^{\alpha + \beta} - 1} < \frac{1}{e^{\beta} - 1} \leq \frac{1}{\sqrt{2} - 1} \ \text{ for } i > 1.$$

Since $r_1 = 3n$ we must have

$$z_{11} > 3n - \frac{n}{\sqrt{2} - 1} > 0.58n.$$

Let us show that the independence table $Y$ and the typical table $Z$ may also produce
different asymptotic behavior of the sums $\sigma_S(Y)$ and $\sigma_S(Z)$ as $m$ and $n$ grow and
$S$ is a subset of entries consisting of a positive fraction of all entries as in Theorem
1.5. For that, let us fix some margins $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ such
that $z_{11} \neq y_{11}$. For a positive integer $k$ let us consider the "cloned" margins

$$R_k = (\underbrace{kr_1, \ldots, kr_1}_{k \text{ times}}, \ldots, \underbrace{kr_m, \ldots, kr_m}_{k \text{ times}}) \quad \text{and} \quad C_k = (\underbrace{kc_1, \ldots, kc_1}_{k \text{ times}}, \ldots, \underbrace{kc_n, \ldots, kc_n}_{k \text{ times}}). \tag{1.6.1}$$

In particular, tables $D \in \Sigma(R_k, C_k)$ are $km \times kn$ matrices whose total sum of entries
is equal to $k^2 N$, where $N = r_1 + \cdots + r_m = c_1 + \cdots + c_n$. Let $S = S_k$ be the set
of entries in the upper left $k \times k$ corner of a matrix from $\Sigma(R_k, C_k)$, let $Y_k$ be the
independence table of margins $(R_k, C_k)$ and let $Z_k$ be the typical table of margins
$(R_k, C_k)$. It is not hard to show that $\sigma_S(Z_k) = k^2 z_{11}$ and $\sigma_S(Y_k) = k^2 y_{11}$, so the
ratio between the two sums remains fixed (and not equal to 1) as $k$ grows.
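The identity $\sigma_S(Y_k) = k^2 y_{11}$ for the independence table is easy to check by direct computation. The sketch below does this in exact rational arithmetic; the helper names `independence_table`, `clone` and `corner_sum` are introduced here for illustration only, and the margins in the usage note are an arbitrary small example.

```python
from fractions import Fraction

def independence_table(R, C):
    # y_ij = r_i * c_j / N with N the total sum of the margins
    N = sum(R)
    return [[Fraction(r * c, N) for c in C] for r in R]

def clone(v, k):
    # (k*v_1, ..., k*v_1, ..., k*v_last, ..., k*v_last), each value repeated k times
    return [k * x for x in v for _ in range(k)]

def corner_sum(A, k):
    # sum of the entries in the upper left k x k corner
    return sum(A[i][j] for i in range(k) for j in range(k))
```

For instance, with `R = C = [9, 3, 3]` and `k = 4`, `corner_sum(independence_table(clone(R, 4), clone(C, 4)), 4)` equals `16 * independence_table(R, C)[0][0]`.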

It looks plausible that the independence table $Y$ is indeed close with high
probability to a random table $D \in \Sigma(R, C)$, if, instead of the uniform distribution in
$\Sigma(R, C)$, a table $D = (d_{ij})$ is sampled from the Fisher-Yates probability measure,
where

$$\Pr(D) = \frac{\prod_{i=1}^{m} r_i! \ \prod_{j=1}^{n} c_j!}{N! \ \prod_{i,j} d_{ij}!};$$

see [DG95]. Compared with the uniform distribution, the Fisher-Yates measure
gives less weight to tables with large entries.
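To make the Fisher-Yates measure concrete, the following sketch enumerates all tables with small given margins by brute force and evaluates the standard multiple hypergeometric formula $\prod_i r_i! \prod_j c_j! / (N! \prod_{i,j} d_{ij}!)$ exactly. The function names `tables` and `fisher_yates` are illustrative, and the enumeration is only feasible for tiny examples.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def tables(R, C):
    # Brute-force enumeration of non-negative integer matrices with margins (R, C)
    m, n = len(R), len(C)
    for flat in product(range(max(R) + 1), repeat=m * n):
        D = [flat[i * n:(i + 1) * n] for i in range(m)]
        if [sum(row) for row in D] == list(R) and [sum(col) for col in zip(*D)] == list(C):
            yield D

def fisher_yates(D):
    # prod r_i! * prod c_j! / (N! * prod d_ij!) for the margins of D
    R = [sum(row) for row in D]
    C = [sum(col) for col in zip(*D)]
    num = prod(factorial(r) for r in R) * prod(factorial(c) for c in C)
    den = factorial(sum(R)) * prod(factorial(d) for row in D for d in row)
    return Fraction(num, den)
```

With margins `R = [2, 1]`, `C = [1, 2]` there are two tables; the one containing the entry 2 receives probability $1/3$ and the other $2/3$, whereas the uniform measure would give each of them $1/2$.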

Let $p, q > 0$ be real numbers such that $p + q = 1$. Recall that a discrete random
variable $x$ has geometric distribution if

$$\Pr\{x = k\} = pq^k \ \text{ for } k = 0, 1, \ldots$$

We have

$$\mathbf{E}\, x = \frac{q}{p}.$$

Consequently,

$$\text{if } \mathbf{E}\, x = z \ \text{ then } \ p = \frac{1}{1 + z} \ \text{ and } \ q = \frac{z}{1 + z}.$$

The following interpretation of the typical matrix was suggested to the author by
J. A. Hartigan; see [BH09].

(1.7) Theorem. Let $Z = (z_{ij})$ be the $m \times n$ typical table with margins $(R, C)$. Let
$X = (x_{ij})$ be the random $m \times n$ matrix of independent geometric random variables
$x_{ij}$ such that

$$\mathbf{E}\, x_{ij} = z_{ij} \ \text{ for all } i, j.$$

Then the probability mass function of $X$ is constant on the set $\Sigma(R, C)$ of
contingency tables with margins $(R, C)$, and, moreover,

$$\Pr\{X = D\} = e^{-g(Z)} \ \text{ for all } D \in \Sigma(R, C),$$

where $g$ is the function defined in Section 1.2.

In other words, the multivariate geometric distribution $X$ whose expectation is
the typical matrix $Z$, when conditioned on the set $\Sigma(R, C)$ of contingency tables,
results in the uniform probability distribution on $\Sigma(R, C)$. It turns out that for
a positive $m \times n$ matrix $A$ the value of $g(A)$ is equal to the maximum possible
entropy of a random matrix with expectation $A$ and values in the set $\mathbb{Z}_+^{m \times n}$ of
$m \times n$ non-negative integer matrices. Such a maximum entropy random matrix is
necessarily a matrix with independent geometrically distributed entries. Therefore,
the distribution of $X$ in Theorem 1.7 can be characterized as the maximum entropy
distribution in the class consisting of all probability distributions on $\mathbb{Z}_+^{m \times n}$ whose
expectations lie in the affine subspace consisting of the matrices with row sums $R$
and column sums $C$; see [BH09].
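The constancy claimed in Theorem 1.7 can be observed numerically on a tiny example. The sketch below assumes only that the means have the form $z_{ij} = 1/(e^{\lambda_i + \mu_j} - 1)$, which is what the optimality condition (2.3.1) gives; the parameters `lam` and `mu` are arbitrary positive numbers chosen for illustration, so the matrix `Z` here is not the typical table of the margins used, and only constancy of the probability mass function is checked, not the value $e^{-g(Z)}$. The helper names are hypothetical.

```python
import math
from itertools import product

def tables(R, C):
    # Brute-force enumeration of non-negative integer matrices with margins (R, C)
    m, n = len(R), len(C)
    for flat in product(range(max(R) + 1), repeat=m * n):
        D = [flat[i * n:(i + 1) * n] for i in range(m)]
        if [sum(row) for row in D] == list(R) and [sum(col) for col in zip(*D)] == list(C):
            yield D

def pmf_of_table(D, Z):
    # Product over entries of the geometric pmf with mean z_ij evaluated at d_ij
    pr = 1.0
    for drow, zrow in zip(D, Z):
        for d, z in zip(drow, zrow):
            pr *= (1.0 / (1.0 + z)) * (z / (1.0 + z)) ** d
    return pr

lam, mu = [0.3, 0.7], [0.2, 0.5, 0.4]            # arbitrary illustrative parameters
Z = [[1.0 / math.expm1(l + u) for u in mu] for l in lam]
probs = [pmf_of_table(D, Z) for D in tables([2, 3], [1, 2, 2])]
```

All entries of `probs` agree up to floating-point rounding, because each factor $(z_{ij}/(1+z_{ij}))^{d_{ij}}$ equals $e^{-(\lambda_i + \mu_j) d_{ij}}$ and the exponents add up to a quantity depending only on the margins.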

(1.8) Possible ramifications and open questions. Theorem 1.7 allows one to
interpret Theorem 1.5 as a law of large numbers for contingency tables: with
respect to sums $\sigma_S(D)$ for sufficiently large sets $S$ of entries, a random contingency
table $D \in \Sigma(R, C)$ behaves approximately as the matrix of independent geometric
variables whose expectation is the typical table. Similar concentration results
can be obtained for other well-behaved functions on contingency tables. One can
ask whether the distribution of a particular entry of a random table $D \in \Sigma(R, C)$
is asymptotically geometric, as the dimensions $m$ and $n$ of the table grow. For
example, does the first entry $d_{11}$ of the table converge in distribution to the
geometric random variable with expectation $z_{11}$ when the margins $(R, C)$ are cloned,
$(R, C) \longmapsto (R_k, C_k)$, as in (1.6.1)?

Let us fix a subset

$$W \subset \{(i, j) : i = 1, \ldots, m; \ j = 1, \ldots, n\}.$$

Let us consider the set $\Sigma(R, C; W)$ of $m \times n$ non-negative integer matrices $D = (d_{ij})$
with row sums $R$, column sums $C$ and such that $d_{ij} = 0$ for $(i, j) \notin W$. Assuming
that $\Sigma(R, C; W)$ is non-empty, we can consider $\Sigma(R, C; W)$ as a finite probability
space with the uniform measure and ask what a random table $D \in \Sigma(R, C; W)$
looks like.

As above, we define the typical table $Z$ as the unique maximum of $g(X)$ on the
polytope of non-negative matrices $X = (x_{ij})$ with row sums $R$, column sums $C$
and such that $x_{ij} = 0$ for $(i, j) \notin W$. One can prove versions of Theorem 1.5 and
Theorem 1.7 in this more general context for subsets $S \subset W$. However, it appears
that for Theorem 1.5 one has to assume, additionally, that none of the entries
$z_{ij}$ of the typical table $Z = (z_{ij})$ is too large or too small, cf. the
example in Section 1.6. In our case, when $W$ is the set of all pairs $(i, j)$, Lemma 2.4
ensures that the entries are not too small while Lemma 3.4 ensures that they
are not too large.

In [Ba08] another variation of the problem is considered: what if we require
$d_{ij} \in \{0, 1\}$ for all $i, j$? It turns out that a random $D$ is close to a particular matrix
maximizing the sum of entropies of the entries among all matrices with row sums
$R$, column sums $C$ and entries between 0 and 1.

In the rest of the paper, we prove Theorem 1.5.

In Section 2, we recall the main results of [Ba09] connecting the typical table
$Z$ with an asymptotic estimate for the number $|\Sigma(R, C)|$ of tables and also prove
Theorem 1.7.

In Section 3, we prove Theorem 1.5 under the additional assumption that the
total sum $N$ of the entries is bounded by a polynomial in $m$ and $n$.
In Section 4, we complete the proof of Theorem 1.5.

2. Preliminaries: an asymptotic formula for the number of tables 

In [Ba09], the following result was proved; see Theorem 1.1 there. 

(2.1) Theorem. Let $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$ be positive integer
vectors such that $r_1 + \cdots + r_m = c_1 + \cdots + c_n = N$. Let us define a function

$$F(x, y) = \left(\prod_{i=1}^{m} x_i^{-r_i}\right) \left(\prod_{j=1}^{n} y_j^{-c_j}\right) \prod_{i,j} \frac{1}{1 - x_i y_j}$$

for $x = (x_1, \ldots, x_m)$ and $y = (y_1, \ldots, y_n)$.
Then $F(x, y)$ attains its minimum

$$\rho(R, C) = \min_{\substack{0 < x_1, \ldots, x_m < 1 \\ 0 < y_1, \ldots, y_n < 1}} F(x, y)$$

on the open cube $0 < x_i, y_j < 1$ and for the number $|\Sigma(R, C)|$ of non-negative
integer $m \times n$ matrices with row sums $R$ and column sums $C$ we have

$$\rho(R, C) \geq |\Sigma(R, C)| \geq N^{-\gamma(m+n)} \rho(R, C),$$

where $\gamma > 0$ is an absolute constant.

□




As is remarked in [Ba09], the substitution $x_i = e^{-s_i}$, $y_j = e^{-t_j}$ transforms
$\ln F(x, y)$ into a convex function

$$G(s, t) = \sum_{i=1}^{m} r_i s_i + \sum_{j=1}^{n} c_j t_j - \sum_{i,j} \ln\left(1 - e^{-s_i - t_j}\right)$$

for $s = (s_1, \ldots, s_m)$ and $t = (t_1, \ldots, t_n)$

on the positive orthant $\mathbb{R}^m_+ \times \mathbb{R}^n_+$. It turns out that the typical table $Z$ is the
solution to the problem that is convex dual to the problem of minimizing $G$. The
following result was proved in [Ba09]; see Lemma 1.4 there.

(2.2) Lemma. Let $\mathcal{P} = \mathcal{P}(R, C)$ be the polytope of $m \times n$ non-negative matrices
$X = (x_{ij})$ with row sums $R$ and column sums $C$ and let $Z \in \mathcal{P}(R, C)$ be the typical
table; see Section 1.2.

Then one can write $Z = (z_{ij})$,

$$z_{ij} = \frac{\xi_i \eta_j}{1 - \xi_i \eta_j} \ \text{ for all } i, j$$

and some $0 < \xi_1, \ldots, \xi_m; \eta_1, \ldots, \eta_n < 1$ such that the minimum $\rho(R, C)$ of the
function $F(x, y)$ in Theorem 2.1 is attained at $x^* = (\xi_1, \ldots, \xi_m)$ and
$y^* = (\eta_1, \ldots, \eta_n)$:

$$F(x^*, y^*) = \rho(R, C) = \min_{\substack{0 < x_1, \ldots, x_m < 1 \\ 0 < y_1, \ldots, y_n < 1}} F(x, y).$$

Moreover,

$$\rho(R, C) = \exp\{g(Z)\}.$$

□

Theorem 1.7 is a particular case of a more general result proved in [BH09].
Nevertheless, we present the proof of Theorem 1.7 here for completeness and since
some elements of the proof will be recycled later.

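(2.3) Proof of Theorem 1.7. From Lemma 2.2, we have $z_{ij} > 0$ for all $i, j$. Since
$Z$ lies in the relative interior of the transportation polytope $\mathcal{P}(R, C)$, the gradient
of $g$ at $Z$ must be orthogonal to the subspace of $m \times n$ matrices with row and
column sums equal to 0. Therefore,

$$\ln\left(\frac{z_{ij} + 1}{z_{ij}}\right) = \lambda_i + \mu_j \ \text{ for all } i, j \tag{2.3.1}$$

and some $\lambda_1, \ldots, \lambda_m$ and $\mu_1, \ldots, \mu_n$.

For the geometric random variables $x_{ij}$ we have

$$\Pr\{x_{ij} = d\} = \frac{1}{1 + z_{ij}} \left(\frac{z_{ij}}{1 + z_{ij}}\right)^{d} \ \text{ for } d = 0, 1, \ldots$$

Using (2.3.1), for $D \in \Sigma(R, C)$, $D = (d_{ij})$, we obtain

$$\begin{aligned}
\Pr\{X = D\} &= \left(\prod_{i,j} \frac{1}{1 + z_{ij}}\right) \prod_{i,j} \left(\frac{z_{ij}}{1 + z_{ij}}\right)^{d_{ij}}
= \left(\prod_{i,j} \frac{1}{1 + z_{ij}}\right) \prod_{i,j} e^{-(\lambda_i + \mu_j) d_{ij}} \\
&= \left(\prod_{i,j} \frac{1}{1 + z_{ij}}\right) \left(\prod_{i=1}^{m} e^{-\lambda_i r_i}\right) \left(\prod_{j=1}^{n} e^{-\mu_j c_j}\right).
\end{aligned}$$

Also,

$$\begin{aligned}
e^{-g(Z)} &= \prod_{i,j} \frac{z_{ij}^{z_{ij}}}{(1 + z_{ij})^{1 + z_{ij}}}
= \left(\prod_{i,j} \frac{1}{1 + z_{ij}}\right) \prod_{i,j} \left(\frac{z_{ij}}{1 + z_{ij}}\right)^{z_{ij}}
= \left(\prod_{i,j} \frac{1}{1 + z_{ij}}\right) \prod_{i,j} e^{-(\lambda_i + \mu_j) z_{ij}} \\
&= \left(\prod_{i,j} \frac{1}{1 + z_{ij}}\right) \left(\prod_{i=1}^{m} e^{-\lambda_i r_i}\right) \left(\prod_{j=1}^{n} e^{-\mu_j c_j}\right),
\end{aligned}$$

which completes the proof. □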
We will need a lower bound for the entries of the typical table $Z = (z_{ij})$ proved
in [B+08]; see Theorem 3.3 there.

(2.4) Lemma. Let

$$r_+ = \max_{i=1,\ldots,m} r_i, \quad r_- = \min_{i=1,\ldots,m} r_i \quad \text{and} \quad c_+ = \max_{j=1,\ldots,n} c_j, \quad c_- = \min_{j=1,\ldots,n} c_j.$$

Let $Z = (z_{ij})$ be the typical table with margins $(R, C)$. Then

$$z_{ij} \geq \frac{r_- c_j}{r_+ m} \quad \text{and} \quad z_{ij} \geq \frac{c_- r_i}{c_+ n} \ \text{ for all } i, j.$$

□

(2.5) Corollary. Let $Z = (z_{ij})$ be the typical table of $\delta$-smooth margins $(R, C)$.
Then

$$z_{ij} \geq \frac{\delta^3 N}{mn} \ \text{ for all } i, j.$$

Proof. In Lemma 2.4, we have

$$r_- \geq \frac{\delta N}{m}, \quad c_- \geq \frac{\delta N}{n} \quad \text{and} \quad r_+ \leq \frac{N}{\delta m},$$

and the result follows. □

3. Proof of Theorem 1.5 assuming that N is polynomially bounded 

In this section we prove Theorem 1.5 under the additional assumption that the
total sum $N$ of entries is bounded by a polynomial in $m$ and $n$, specifically that
$N \leq (mn)^{1/\delta}$. We use Theorem 1.7. We start with a standard large deviation
inequality.

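(3.1) Lemma. Let $X = (x_{ij})$ be the $m \times n$ matrix of independent geometric random
variables $x_{ij}$ such that $\mathbf{E}\, X = Z$, $Z = (z_{ij})$. Let

$$S \subset \{(i, j) : 1 \leq i \leq m, \ 1 \leq j \leq n\}$$

be a non-empty set. Recall that

$$\sigma_S(X) = \sum_{(i,j) \in S} x_{ij} \quad \text{and} \quad \sigma_S(Z) = \sum_{(i,j) \in S} z_{ij}$$

and let us denote

$$\nu_S(Z) = \sum_{(i,j) \in S} z_{ij}^2.$$

Then

(1) For any real $a$ and for any $0 \leq t \leq 2$, we have

$$\Pr\{\sigma_S(X) \leq -a + \sigma_S(Z)\} \leq \exp\left\{-ta + \frac{t^2}{2}\left(\sigma_S(Z) + \nu_S(Z)\right)\right\}.$$

(2) For any real $a$ and for any $0 \leq t \leq \min\{1/3, \ 1/2z_{ij} : (i, j) \in S\}$, we have

$$\Pr\{\sigma_S(X) \geq a + \sigma_S(Z)\} \leq \exp\left\{-ta + 2t^2\left(\sigma_S(Z) + \nu_S(Z)\right)\right\}.$$

Proof. We use the Laplace transform method; see, for example, Section 1.6 of
[Le01]. To prove Part (1), for any $t \geq 0$ we compute

$$\mathbf{E}\, e^{-t\sigma_S(X)} = \prod_{(i,j) \in S} \mathbf{E}\, e^{-t x_{ij}} = \prod_{(i,j) \in S} \frac{p_{ij}}{1 - e^{-t} q_{ij}},$$

where

$$\Pr\{x_{ij} = k\} = p_{ij} q_{ij}^k \ \text{ for } k = 0, 1, \ldots$$

Using the fact that $e^{-t} \leq 1 - t + t^2/2$ for $t \geq 0$, we obtain

$$\mathbf{E}\, e^{-t\sigma_S(X)} \leq \prod_{(i,j) \in S} \frac{p_{ij}}{p_{ij} + (t - t^2/2) q_{ij}} = \prod_{(i,j) \in S} \frac{1}{1 + (t - t^2/2) z_{ij}}.$$

Using the fact that $t - t^2/2 \geq 0$ for $0 \leq t \leq 2$ and that $\ln(1 + x) \geq x - x^2/2$ for
$x \geq 0$, we obtain

$$\begin{aligned}
\mathbf{E}\, e^{-t\sigma_S(X)} &\leq \exp\left\{-\sum_{(i,j) \in S} \ln\left(1 + (t - t^2/2) z_{ij}\right)\right\} \\
&\leq \exp\left\{-(t - t^2/2) \sum_{(i,j) \in S} z_{ij} + \frac{(t - t^2/2)^2}{2} \sum_{(i,j) \in S} z_{ij}^2\right\} \\
&\leq \exp\left\{-t\sigma_S(Z) + \frac{t^2}{2}\left(\sigma_S(Z) + \nu_S(Z)\right)\right\}.
\end{aligned}$$

Then

$$\begin{aligned}
\Pr\{\sigma_S(X) \leq -a + \sigma_S(Z)\} &= \Pr\{-t\sigma_S(X) \geq ta - t\sigma_S(Z)\} \\
&= \Pr\left\{e^{-t\sigma_S(X)} \geq e^{ta - t\sigma_S(Z)}\right\} \\
&\leq e^{-ta + t\sigma_S(Z)}\, \mathbf{E}\, e^{-t\sigma_S(X)} \\
&\leq \exp\left\{-ta + \frac{t^2}{2}\left(\sigma_S(Z) + \nu_S(Z)\right)\right\}.
\end{aligned}$$

To prove Part (2), we observe that $e^t \leq 1 + t + t^2$ for all $0 \leq t \leq 1$. Therefore,
for $0 \leq t \leq \min\{1/3, \ 1/2z_{ij} : (i, j) \in S\}$ we have

$$e^t \leq 1 + 2t \leq 1 + \frac{1}{z_{ij}} = \frac{1}{q_{ij}}$$

and hence

$$\mathbf{E}\, e^{t\sigma_S(X)} = \prod_{(i,j) \in S} \mathbf{E}\, e^{t x_{ij}} = \prod_{(i,j) \in S} \frac{p_{ij}}{1 - e^{t} q_{ij}} \leq \prod_{(i,j) \in S} \frac{p_{ij}}{p_{ij} - (t + t^2) q_{ij}} = \prod_{(i,j) \in S} \frac{1}{1 - (t + t^2) z_{ij}}.$$

Since $t \leq 1/3$ we have $t + t^2 \leq (4/3)t$ and hence $(t + t^2) z_{ij} \leq 2/3$. Using the fact
that $\ln(1 - x) \geq -x - x^2$ for $0 \leq x \leq 2/3$, we obtain

$$\begin{aligned}
\mathbf{E}\, e^{t\sigma_S(X)} &\leq \exp\left\{-\sum_{(i,j) \in S} \ln\left(1 - (t + t^2) z_{ij}\right)\right\} \\
&\leq \exp\left\{(t + t^2) \sum_{(i,j) \in S} z_{ij} + (t + t^2)^2 \sum_{(i,j) \in S} z_{ij}^2\right\} \\
&\leq \exp\left\{t\sigma_S(Z) + 2t^2\left(\sigma_S(Z) + \nu_S(Z)\right)\right\}.
\end{aligned}$$

Therefore,

$$\begin{aligned}
\Pr\{\sigma_S(X) \geq a + \sigma_S(Z)\} &= \Pr\{t\sigma_S(X) \geq ta + t\sigma_S(Z)\} \\
&= \Pr\left\{e^{t\sigma_S(X)} \geq e^{ta + t\sigma_S(Z)}\right\} \\
&\leq e^{-ta - t\sigma_S(Z)}\, \mathbf{E}\, e^{t\sigma_S(X)} \\
&\leq \exp\left\{-ta + 2t^2\left(\sigma_S(Z) + \nu_S(Z)\right)\right\}.
\end{aligned}$$

□

One can observe that $\sigma_S(Z) + \nu_S(Z)$ is the variance of $\sigma_S(X)$.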
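The two moment generating function estimates in the proof of Lemma 3.1 can be sanity-checked for a single geometric variable. The sketch below compares logarithms (to avoid overflow) on a small grid of means $z$ and parameters $t$ within the ranges allowed by Parts (1) and (2). The grid, the tolerance `1e-12` and the function names are choices made here for illustration; the check is numerical evidence on a few points, not a proof.

```python
import math

def log_mgf_geometric(z, t):
    # log E exp(t x) for Pr{x = k} = p q^k with mean z; requires exp(t) * q < 1
    p, q = 1.0 / (1.0 + z), z / (1.0 + z)
    return math.log(p) - math.log1p(-math.exp(t) * q)

def lower_tail_ok(z, t):
    # Part (1), one entry: E exp(-t x) <= exp(-t z + (t^2 / 2)(z + z^2)) for 0 <= t <= 2
    return log_mgf_geometric(z, -t) <= -t * z + 0.5 * t * t * (z + z * z) + 1e-12

def upper_tail_ok(z, t):
    # Part (2), one entry: E exp(t x) <= exp(t z + 2 t^2 (z + z^2))
    # for 0 <= t <= min(1/3, 1/(2z))
    return log_mgf_geometric(z, t) <= t * z + 2.0 * t * t * (z + z * z) + 1e-12

checks = []
for z in [0.01, 0.5, 1.0, 7.0, 250.0]:
    t_max = min(1.0 / 3.0, 1.0 / (2.0 * z))
    for k in range(21):
        checks.append(lower_tail_ok(z, 2.0 * k / 20))
        checks.append(upper_tail_ok(z, t_max * k / 20))
```

Every entry of `checks` should be `True`; multiplying the single-entry bounds over $(i, j) \in S$ gives the estimates for $\mathbf{E}\, e^{\mp t \sigma_S(X)}$ used above.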



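(3.2) Corollary. Let $(R, C)$ be $\delta$-smooth margins with the typical table $Z = (z_{ij})$
and let $X = (x_{ij})$ be the matrix of independent geometric variables such that
$\mathbf{E}\, X = Z$. Suppose that

$$z_{ij} \leq \frac{\alpha N}{mn} \ \text{ for all } (i, j) \in S$$

and some $\alpha \geq 1$. Then

(1) For any $0 \leq \epsilon \leq 1$ we have

$$\Pr\{\sigma_S(X) \leq (1 - \epsilon)\sigma_S(Z)\} \leq \exp\left\{-\frac{\epsilon^2 \delta^4 |S|}{2(1 + \alpha\delta)}\right\}.$$

(2) For any $0 \leq \epsilon \leq 1$ we have

$$\Pr\{\sigma_S(X) \geq (1 + \epsilon)\sigma_S(Z)\} \leq \exp\left\{-\frac{\epsilon^2 \delta^4 |S|}{8(1 + \alpha\delta)}\right\}.$$

Proof. Choosing

$$a = \epsilon\sigma_S(Z) \quad \text{and} \quad t = \frac{\epsilon\sigma_S(Z)}{\sigma_S(Z) + \nu_S(Z)}$$

in Part (1) of Lemma 3.1, we obtain

$$\Pr\{\sigma_S(X) \leq (1 - \epsilon)\sigma_S(Z)\} \leq \exp\left\{-\frac{\epsilon^2 \sigma_S^2(Z)}{2\left(\sigma_S(Z) + \nu_S(Z)\right)}\right\}. \tag{3.2.1}$$

Furthermore,

$$\nu_S(Z) = \sum_{(i,j) \in S} z_{ij}^2 \leq \frac{\alpha N}{mn} \sum_{(i,j) \in S} z_{ij} = \frac{\alpha N}{mn}\, \sigma_S(Z). \tag{3.2.2}$$

By Corollary 2.5,

$$\sigma_S(Z) \geq |S|\, \frac{\delta^3 N}{mn}. \tag{3.2.3}$$

We recall that

$$\frac{N}{mn} \geq \delta. \tag{3.2.4}$$

Summarizing (3.2.1)-(3.2.4), we get

$$\begin{aligned}
\Pr\{\sigma_S(X) \leq (1 - \epsilon)\sigma_S(Z)\} &\leq \exp\left\{-\frac{\epsilon^2 \sigma_S(Z)\, mn}{2(mn + \alpha N)}\right\}
\leq \exp\left\{-\frac{\epsilon^2 |S| \delta^3 N}{2(mn + \alpha N)}\right\} \\
&= \exp\left\{-\frac{\epsilon^2 |S| \delta^3}{2(mn/N + \alpha)}\right\}
\leq \exp\left\{-\frac{\epsilon^2 \delta^4 |S|}{2(1 + \alpha\delta)}\right\}
\end{aligned}$$

and Part (1) follows.

Let us choose $a = \epsilon\sigma_S(Z)$ in Part (2) of Lemma 3.1. Let

$$t_0 = \frac{\epsilon\sigma_S(Z)}{4\left(\sigma_S(Z) + \nu_S(Z)\right)}.$$

Clearly, $t_0 \leq 1/4 < 1/3$. If $t_0 \leq mn/2\alpha N$, we choose $t = t_0$ and if $t_0 > mn/2\alpha N$,
we choose $t = mn/2\alpha N$ in Part (2) of Lemma 3.1. Hence if $t_0 \leq mn/2\alpha N$, we
obtain as above in Part (1)

$$\Pr\{\sigma_S(X) \geq (1 + \epsilon)\sigma_S(Z)\} \leq \exp\left\{-\frac{\epsilon^2 \sigma_S^2(Z)}{8\left(\sigma_S(Z) + \nu_S(Z)\right)}\right\} \leq \exp\left\{-\frac{\epsilon^2 \delta^4 |S|}{8(1 + \alpha\delta)}\right\}. \tag{3.2.5}$$

If $t_0 > mn/2\alpha N$ then

$$\sigma_S(Z) + \nu_S(Z) < \frac{\epsilon \alpha N \sigma_S(Z)}{2mn}.$$

Therefore, choosing $t = mn/2\alpha N$ in Part (2) of Lemma 3.1, we obtain

$$\begin{aligned}
\Pr\{\sigma_S(X) \geq (1 + \epsilon)\sigma_S(Z)\} &\leq \exp\left\{-\frac{\epsilon \sigma_S(Z)\, mn}{2\alpha N} + \frac{(mn)^2 \left(\sigma_S(Z) + \nu_S(Z)\right)}{2\alpha^2 N^2}\right\} \\
&\leq \exp\left\{-\frac{\epsilon \sigma_S(Z)\, mn}{4\alpha N}\right\}.
\end{aligned}$$

Using (3.2.3), we obtain

$$\Pr\{\sigma_S(X) \geq (1 + \epsilon)\sigma_S(Z)\} \leq \exp\left\{-\frac{\epsilon \delta^3 |S|}{4\alpha}\right\}. \tag{3.2.6}$$

Comparing (3.2.5) and (3.2.6) (the bound (3.2.6) is at least as strong as (3.2.5), since
$\epsilon\alpha\delta \leq 2(1 + \alpha\delta)$ for $\epsilon \leq 1$), we complete the proof. □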

Now we can prove the following weaker version of Theorem 1.5.

(3.3) Proposition. Let us fix real numbers $0 < \delta < 1$ and $\kappa > 0$. Then there
exists a positive integer $q = q(\delta, \kappa)$ such that the following holds:

Suppose that $(R, C)$ are $\delta$-smooth margins such that $n \geq m > q$ and let $Z = (z_{ij})$
be the typical table with margins $(R, C)$. Let

$$S \subset \{(i, j) : 1 \leq i \leq m, \ 1 \leq j \leq n\}$$

be a set such that

$$|S| \geq \delta mn$$

and suppose that the entries of the typical table satisfy the inequalities

$$z_{ij} \leq \frac{\alpha N}{mn} \ \text{ for } \alpha = 2\delta^{-1} m^{1/3}$$

and all $(i, j) \in S$.

Suppose further that for the total sum $N$ of entries we have

$$N \leq (mn)^{1/\delta}.$$

Let

$$\epsilon = \frac{\delta \ln n}{m^{1/3}}.$$

If $\epsilon \leq 1$, we have

$$\Pr\{D \in \Sigma(R, C) : \sigma_S(D) \leq (1 - \epsilon)\sigma_S(Z)\} \leq n^{-\kappa n} \quad \text{and}$$

$$\Pr\{D \in \Sigma(R, C) : \sigma_S(D) \geq (1 + \epsilon)\sigma_S(Z)\} \leq n^{-\kappa n}.$$

Proof. Let $X = (x_{ij})$ be the $m \times n$ matrix of independent geometric random
variables $x_{ij}$ such that $\mathbf{E}\, X = Z$. By Theorem 1.7, the distribution of $X$ conditioned
on $X \in \Sigma(R, C)$ is uniform and hence

$$\Pr\{D \in \Sigma(R, C) : \sigma_S(D) \leq (1 - \epsilon)\sigma_S(Z)\} = \frac{\Pr\{X : \sigma_S(X) \leq (1 - \epsilon)\sigma_S(Z) \text{ and } X \in \Sigma(R, C)\}}{\Pr\{X : X \in \Sigma(R, C)\}}.$$

Similarly,

$$\Pr\{D \in \Sigma(R, C) : \sigma_S(D) \geq (1 + \epsilon)\sigma_S(Z)\} = \frac{\Pr\{X : \sigma_S(X) \geq (1 + \epsilon)\sigma_S(Z) \text{ and } X \in \Sigma(R, C)\}}{\Pr\{X : X \in \Sigma(R, C)\}}.$$

By Theorem 1.7, Lemma 2.2 and Theorem 2.1 we get

$$\Pr\{X \in \Sigma(R, C)\} = e^{-g(Z)} |\Sigma(R, C)| \geq N^{-\gamma(m+n)}$$

for some absolute constant $\gamma > 0$. Since $N \leq (mn)^{1/\delta}$, we obtain

$$\Pr\{D \in \Sigma(R, C) : \sigma_S(D) \leq (1 - \epsilon)\sigma_S(Z)\} \leq (mn)^{\gamma_1(m+n)} \Pr\{X : \sigma_S(X) \leq (1 - \epsilon)\sigma_S(Z)\}$$

and similarly

$$\Pr\{D \in \Sigma(R, C) : \sigma_S(D) \geq (1 + \epsilon)\sigma_S(Z)\} \leq (mn)^{\gamma_1(m+n)} \Pr\{X : \sigma_S(X) \geq (1 + \epsilon)\sigma_S(Z)\}$$

for some constant $\gamma_1 = \gamma_1(\delta) > 0$. By Part (1) of Corollary 3.2,

$$\Pr\{X : \sigma_S(X) \leq (1 - \epsilon)\sigma_S(Z)\} \leq \exp\left\{-\frac{\delta^7 mn \ln^2 n}{m^{2/3}\left(2 + 4m^{1/3}\right)}\right\}$$

while by Part (2) of Corollary 3.2

$$\Pr\{X : \sigma_S(X) \geq (1 + \epsilon)\sigma_S(Z)\} \leq \exp\left\{-\frac{\delta^7 mn \ln^2 n}{m^{2/3}\left(8 + 16m^{1/3}\right)}\right\}$$

and the result follows. □

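Next, we prove that large entries of the typical table $Z$ belong to a small number
of rows.

(3.4) Lemma. Let $(R, C)$ be $\delta$-smooth margins and let $Z = (z_{ij})$ be the $m \times n$
typical table with margins $(R, C)$. Let $\alpha \geq 2mn/N$ be a real number. Let

$$I = \left\{i : z_{ij} \geq \frac{\alpha N}{mn} \ \text{ for some } j\right\}.$$

Then

$$|I| \leq \frac{4m}{\delta\alpha}.$$

Proof. By (2.3.1), we can write

$$\ln\left(\frac{z_{ij} + 1}{z_{ij}}\right) = \lambda_i + \mu_j \ \text{ for all } i, j$$

and some $\lambda_1, \ldots, \lambda_m$ and $\mu_1, \ldots, \mu_n$. Since $\lambda_i + \mu_j > 0$ for all $i$ and $j$, without loss
of generality we may assume that $\lambda_1, \ldots, \lambda_m$ and $\mu_1, \ldots, \mu_n$ are positive.
Let

$$I_0 = \left\{i : \lambda_i \leq \frac{mn}{\alpha N}\right\} \quad \text{and} \quad J_0 = \left\{j : \mu_j \leq \frac{mn}{\alpha N}\right\}.$$

If $i \in I$ then for some $j$ we have

$$\frac{mn}{\alpha N} \geq \frac{1}{z_{ij}} \geq \ln\left(\frac{z_{ij} + 1}{z_{ij}}\right) \geq \lambda_i$$

and therefore $I \subset I_0$. Similarly, if $z_{ij} \geq \alpha N/mn$ for some $i$ then $j \in J_0$. Hence
without loss of generality, we may assume that $J_0 \neq \emptyset$.
Let us fix a $j_0 \in J_0$. Then for any $i \in I_0$ we have

$$\ln\left(\frac{z_{ij_0} + 1}{z_{ij_0}}\right) \leq \frac{2mn}{\alpha N}.$$

Hence for all $i \in I_0$ we have

$$\frac{1}{z_{ij_0}} \leq \exp\left\{\frac{2mn}{\alpha N}\right\} - 1 \leq \frac{4mn}{\alpha N}$$

(using the fact that $e^x \leq 1 + 2x$ for $0 \leq x \leq 1$). Hence

$$z_{ij_0} \geq \frac{\alpha N}{4mn} \ \text{ for } i \in I_0.$$

Since

$$\sum_{i=1}^{m} z_{ij_0} = c_{j_0},$$

we conclude that

$$|I| \leq |I_0| \leq \frac{4 c_{j_0} mn}{\alpha N} \leq \frac{4m}{\delta\alpha}.$$

□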



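Finally, we prove the main result of this section.

(3.5) Proposition. In Theorem 1.5 assume, additionally, that $N \leq (mn)^{1/\delta}$
(equivalently, drop the upper bound assumption for $z_{ij}$ in Proposition 3.3). Then
the conclusion of Theorem 1.5 holds (equivalently, the conclusion of Proposition 3.3
holds).

Proof. Let us choose

$$\alpha = 2\delta^{-1} m^{1/3}$$

and let

$$I = \left\{i : z_{ij} > \frac{\alpha N}{mn} \ \text{ for some } j\right\}.$$

Since $N/mn \geq \delta$, we have $\alpha \geq 2mn/N$ and by Lemma 3.4 we have

$$|I| \leq 2m^{2/3}.$$

Let

$$S_0 = \{(i, j) \in S : i \notin I\}.$$

Then

$$|S \setminus S_0| \leq n|I| \leq 2nm^{2/3},$$

and hence for $\delta_0 = \delta/2$ and $m$ sufficiently large, $n \geq m > q(\delta)$, we have

$$|S_0| \geq \delta_0 mn.$$

Furthermore, for any $D \in \Sigma(R, C)$ we have

$$\sigma_{S \setminus S_0}(Z), \ \sigma_{S \setminus S_0}(D) \leq \sum_{i \in I} r_i \leq |I|\, \frac{N}{\delta m} \leq \frac{2N}{\delta m^{1/3}}.$$

On the other hand, by Corollary 2.5, we have

$$\sigma_S(Z) \geq |S|\, \frac{\delta^3 N}{mn} \geq \delta^4 N.$$

Therefore,

$$\sigma_{S_0}(Z) = \sigma_S(Z) - \sigma_{S \setminus S_0}(Z) \geq \left(1 - \frac{2}{\delta^5 m^{1/3}}\right)\sigma_S(Z) \tag{3.5.1}$$

and, similarly,

$$\sigma_S(Z) - \sigma_{S \setminus S_0}(D) \geq \left(1 - \frac{2}{\delta^5 m^{1/3}}\right)\sigma_S(Z). \tag{3.5.2}$$

We have

$$\Pr\{D \in \Sigma(R, C) : \sigma_S(D) \leq (1 - \epsilon)\sigma_S(Z)\} \leq \Pr\{D \in \Sigma(R, C) : \sigma_{S_0}(D) \leq (1 - \epsilon)\sigma_S(Z)\}.$$

By (3.5.1) we obtain

$$(1 - \epsilon)\sigma_S(Z) = \left(1 - \frac{\delta \ln n}{m^{1/3}}\right)\sigma_S(Z) \leq (1 - \epsilon_0)\left(1 - \frac{2}{\delta^5 m^{1/3}}\right)\sigma_S(Z) \leq (1 - \epsilon_0)\sigma_{S_0}(Z),$$

where

$$\epsilon_0 = \frac{\delta \ln n}{2m^{1/3}},$$

and $m$ is sufficiently large, $n \geq m > q(\delta)$.

Applying Proposition 3.3 with $S_0 \subset S$ and $\delta_0 = \delta/2$, we conclude that if $m$ is
sufficiently large, $n \geq m > q(\delta, \kappa)$, we have

$$\Pr\{D \in \Sigma(R, C) : \sigma_{S_0}(D) \leq (1 - \epsilon)\sigma_S(Z)\} \leq \Pr\{D \in \Sigma(R, C) : \sigma_{S_0}(D) \leq (1 - \epsilon_0)\sigma_{S_0}(Z)\} \leq n^{-\kappa n}.$$

Similarly, we have

$$\begin{aligned}
\Pr\{D \in \Sigma(R, C) : \sigma_S(D) \geq (1 + \epsilon)\sigma_S(Z)\} &= \Pr\{D \in \Sigma(R, C) : \sigma_{S_0}(D) \geq (1 + \epsilon)\sigma_S(Z) - \sigma_{S \setminus S_0}(D)\} \\
&\leq \Pr\{D \in \Sigma(R, C) : \sigma_{S_0}(D) \geq (1 + \epsilon)\left(\sigma_S(Z) - \sigma_{S \setminus S_0}(D)\right)\}.
\end{aligned}$$

By (3.5.2) we obtain

$$(1 + \epsilon)\left(\sigma_S(Z) - \sigma_{S \setminus S_0}(D)\right) \geq (1 + \epsilon)\left(1 - \frac{2}{\delta^5 m^{1/3}}\right)\sigma_S(Z) \geq (1 + \epsilon_0)\sigma_{S_0}(Z),$$

where

$$\epsilon_0 = \frac{\delta \ln n}{2m^{1/3}},$$

and $m$ is sufficiently large, $n \geq m > q(\delta)$.

Applying Proposition 3.3 with $S_0 \subset S$ and $\delta_0 = \delta/2$, we conclude that if $m$ is
sufficiently large, $n \geq m > q(\delta, \kappa)$, we have

$$\Pr\{D \in \Sigma(R, C) : \sigma_{S_0}(D) \geq (1 + \epsilon)\left(\sigma_S(Z) - \sigma_{S \setminus S_0}(D)\right)\} \leq \Pr\{D \in \Sigma(R, C) : \sigma_{S_0}(D) \geq (1 + \epsilon_0)\sigma_{S_0}(Z)\} \leq n^{-\kappa n}$$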

and the result follows. □

4. Proof of Theorem 1.5

It remains to prove Theorem 1.5 in the case of a large (superpolynomial in $mn$)
total sum $N$ of entries. More precisely, we assume that $N > (mn)^7$ since the case
of $N \leq (mn)^7$ is covered by Proposition 3.5 with a sufficiently small $\delta \leq 1/7$ (we
recall that $\delta$-smooth margins are also $\delta'$-smooth with any $0 < \delta' < \delta$).

The idea of the proof is as follows: given margins $(R, C)$ whose total sum of
entries is $N$, we construct new margins $(R', C')$ whose total sum of entries $N'$ is
bounded by a polynomial in $mn$ and a scaling map

$$T : \Sigma(R, C) \longrightarrow \Sigma(R', C'),$$

which, roughly, scales every table $D \in \Sigma(R, C)$ by the same factor $t$. We then
deduce Theorem 1.5 for margins $(R, C)$ from that for margins $(R', C')$.
We have

$$R' \approx t^{-1} R, \quad C' \approx t^{-1} C \quad \text{and} \quad T(D) \approx t^{-1} D,$$

where "$\approx$" stands for rounding in some consistent way.

In constructing the map $T$ we essentially follow the ideas of [D+97].

(4.1) Lattices, bases, and fundamental parallelepipeds. Let $V$ be a
finite-dimensional real vector space and let $\Lambda \subset V$ be a lattice, that is, a discrete additive
subgroup of $V$ which spans $V$. Suppose that $\dim V = k$ and let $u_1, \ldots, u_k$ be a
basis of $\Lambda$. The set

$$\Pi = \left\{\sum_{i=1}^{k} \lambda_i u_i : 0 \leq \lambda_i < 1 \ \text{ for } i = 1, \ldots, k\right\}$$

is called the fundamental parallelepiped associated with the basis $u_1, \ldots, u_k$.

Suppose that $\mathcal{A}$ is an affine space, with $\dim \mathcal{A} = \dim V$, on which $V$ acts by
translations: $a + v \in \mathcal{A}$ for all $a \in \mathcal{A}$ and $v \in V$ and $a + (v_1 + v_2) = (a + v_1) + v_2$
for all $a \in \mathcal{A}$ and $v_1, v_2 \in V$. Let us choose $a \in \mathcal{A}$. The set $\Lambda_a = a + \Lambda$ is called a
point lattice in $\mathcal{A}$. As is known, the translations $v + \Pi$ : $v \in \Lambda_a$ cover $\mathcal{A}$ without
overlapping.

We will also use the following standard fact. Suppose that $\Lambda_1 \supset \Lambda$ is a finer
lattice and let $|\Lambda_1/\Lambda| < \infty$ be its index. Then, for any $a, b \in \mathcal{A}$ we have

$$|(a + \Pi) \cap (b + \Lambda_1)| = |\Lambda_1/\Lambda|,$$

see for example Chapter VII of [Ba02].

Let us fix a point lattice $\Lambda_a \subset \mathcal{A}$ and a fundamental parallelepiped $\Pi \subset V$ of
$\Lambda$. Given a point $x \in \mathcal{A}$, we define its rounding $y = \lfloor x \rfloor_{\Lambda_a, \Pi}$ as the unique point
$y \in \Lambda_a$ such that $x \in y + \Pi$.

In our case, $V$ is the space of real $m \times n$ matrices with the row and column sums
equal to 0, so $\dim V = (m - 1)(n - 1)$, while $\mathcal{A}$ is the affine space of $m \times n$ matrices
with prescribed integer row and column sums, so that for all $D \in \mathcal{A}$ and $U \in V$ we
have $D + U \in \mathcal{A}$. Furthermore, let $\Lambda \subset V$ be the lattice of integer matrices and let
$\Lambda' \subset \mathcal{A}$ be the point lattice consisting of integer matrices.

As is shown, for example, in [D+97], lattice $\Lambda$ has a basis consisting of the
matrices $U_{ij}$ for $1 \leq i \leq m - 1$, $1 \leq j \leq n - 1$ that have 1 in the $(i, j)$ and $(i + 1, j + 1)$
positions, $-1$ in the $(i + 1, j)$ and $(i, j + 1)$ positions and zeros elsewhere. Let $\Pi$
be the fundamental parallelepiped of this basis $\{U_{ij}\}$. We call this parallelepiped
$\Pi$ standard. We note that

$$-2 < x_{ij} < 2 \ \text{ for all } i, j \text{ and all } X \in \Pi, \ X = (x_{ij}). \tag{4.1.1}$$

Finally, for a positive integer $t$ let $\Lambda_1 = t^{-1}\Lambda$. Hence $|\Lambda_1/\Lambda| = t^{(m-1)(n-1)}$.

(4.2) The t-scaling map T. Let us choose a positive integer $t$ and an arbitrary
$D_0 \in \Sigma(R, C)$, where $R = (r_1, \ldots, r_m)$ and $C = (c_1, \ldots, c_n)$. Let us define a
positive $m \times n$ matrix $B$ as follows. First, we obtain $D_1$ by rounding up to the
nearest integer every entry of $t^{-1} D_0$ and adding 2 to the result. In particular, $D_1$
is a positive integer matrix. Let

$$B = D_1 - t^{-1} D_0, \ \text{ so } \ D_1 = B + t^{-1} D_0.$$

Clearly, $B = (b_{ij})$ is an $m \times n$ matrix with

$$2 \leq b_{ij} < 3 \ \text{ for all } i, j. \tag{4.2.1}$$

Let $R' = (r'_1, \ldots, r'_m)$ and $C' = (c'_1, \ldots, c'_n)$ be the row and column sums of $D_1$
respectively. Thus $R'$ and $C'$ are positive integer vectors and

$$\begin{aligned}
t^{-1} r_i + 2n &\leq r'_i < t^{-1} r_i + 3n \ \text{ for } i = 1, \ldots, m \quad \text{and} \\
t^{-1} c_j + 2m &\leq c'_j < t^{-1} c_j + 3m \ \text{ for } j = 1, \ldots, n.
\end{aligned} \tag{4.2.2}$$

Let $\mathcal{A}$ be the affine subspace of matrices with row sums $R'$ and column sums $C'$
and let $\Lambda' \subset \mathcal{A}$ be the point lattice of integer matrices. Thus $\Lambda' = D_1 + \Lambda$, where $\Lambda$
is the lattice of $m \times n$ integer matrices with zero row and column sums, see Section
4.1. For a matrix $D \in \Sigma(R, C)$ we define a matrix $T(D)$ by

$$T(D) = \left\lfloor t^{-1} D + B \right\rfloor_{\Lambda', \Pi},$$

where $\Pi$ is the standard parallelepiped of $\Lambda$; see Section 4.1. In words: given a table
$D \in \Sigma(R, C)$, matrix $T(D)$ is the unique integer matrix such that the translation
$T(D) + \Pi$ of the standard parallelepiped $\Pi$ contains $t^{-1} D + B$. Clearly, $T(D)$ is an
$m \times n$ integer matrix with row sums $R'$ and column sums $C'$. Moreover, since every
entry of $t^{-1} D + B$ is at least 2 and because of (4.1.1), matrix $T(D)$ is non-negative.
Hence we have defined a map

$$T : \Sigma(R, C) \longrightarrow \Sigma(R', C').$$

We summarize some of its properties below.

(4.3) Lemma.

(1) For all $Y \in \Sigma(R', C')$ we have

$$|T^{-1}(Y)| \leq t^{(m-1)(n-1)};$$

(2) Let $S \subset \{(i, j) : i = 1, \ldots, m, \ j = 1, \ldots, n\}$ be a set of indices. Then

$$t^{-1}\sigma_S(D) \leq \sigma_S(T(D)) \leq t^{-1}\sigma_S(D) + 5|S|$$

for all $D \in \Sigma(R, C)$.

Proof. Given $Y \in \Sigma(R', C')$, we compute $T^{-1}(Y)$ as follows: we consider the
translation $(Y - B) + \Pi$ of the standard parallelepiped $\Pi$ and observe that

$$T^{-1}(Y) = \left\{D : t^{-1} D \in (Y - B) + \Pi \ \text{ and } \ D \text{ is a non-negative integer matrix}\right\}.$$

Recall that $\Lambda \subset V$ is the lattice of $m \times n$ integer matrices with the row and column
sums equal to 0 and that $\Lambda_1 = t^{-1}\Lambda$. In the affine space of $m \times n$ matrices with row
sums $t^{-1} R$ and column sums $t^{-1} C$ let us consider the point lattice $\Lambda'_1 = t^{-1} D_0 + \Lambda_1$
consisting of matrices $t^{-1} D$ where $D$ is an integer matrix. Then

$$\left|\left((Y - B) + \Pi\right) \cap \Lambda'_1\right| = |\Lambda_1/\Lambda| = t^{(m-1)(n-1)}$$

and Part (1) follows. Part (2) follows because of (4.1.1) and (4.2.1). □
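The construction of Section 4.2 can be traced on a small example. The sketch below builds $D_1$ and $B$ from a reference table $D_0$ and then rounds $t^{-1} D + B$ to the point lattice $D_1 + \Lambda$ with respect to the standard parallelepiped. It relies on an observation made here and not stated in the paper: for a matrix $X$ with zero row and column sums, the coefficient of $U_{ij}$ in the expansion $X = \sum \lambda_{ij} U_{ij}$ equals the sum of the entries $x_{ab}$ with $a \leq i$, $b \leq j$, so rounding amounts to taking integer parts of these partial sums. The function names `scaling_setup` and `t_scale` are illustrative.

```python
from fractions import Fraction
from math import ceil, floor

def scaling_setup(D0, t):
    # D1 = (entrywise ceiling of D0 / t) + 2 and B = D1 - D0 / t, so 2 <= b_ij < 3
    D1 = [[ceil(Fraction(d, t)) + 2 for d in row] for row in D0]
    B = [[D1[i][j] - Fraction(d, t) for j, d in enumerate(row)]
         for i, row in enumerate(D0)]
    return D1, B

def t_scale(D, D1, B, t):
    m, n = len(D), len(D[0])
    # X = D/t + B - D1 has zero row and column sums when D has the margins of D0
    X = [[Fraction(D[i][j], t) + B[i][j] - D1[i][j] for j in range(n)]
         for i in range(m)]
    Y = [row[:] for row in D1]
    for i in range(m - 1):
        for j in range(n - 1):
            # integer part of the coefficient of U_ij (a partial sum of X)
            coeff = floor(sum(X[a][b] for a in range(i + 1) for b in range(j + 1)))
            Y[i][j] += coeff
            Y[i + 1][j + 1] += coeff
            Y[i + 1][j] -= coeff
            Y[i][j + 1] -= coeff
    return Y
```

For example, with `D0 = [[5, 2], [1, 7]]` and `t = 3` one gets `D1 = [[4, 3], [3, 5]]`, and the table `[[3, 4], [3, 5]]`, which has the same margins as `D0`, is mapped to `[[3, 4], [4, 4]]`, a non-negative integer matrix with the row and column sums of `D1`, in line with Part (2) of Lemma 4.3.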

(4.4) Lemma. Suppose that 

r'i, c'j > (mn) 2 for all 
Then, for any ( > we have 

Pr j.D G £(-R, C) : a s (D)>t(} < /3Pr jV G £(-R', C') : <7 S (y) > c} 
and 

Pr |d G E(i2, C) : a s (D)<t(} < /?Pr \y G E(i?', C") : <7 S (r) < C + 5|5| }, 
where (3 > is an absolute constant. 

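Proof. By Part (2) of Lemma 4.3, if $\sigma_S(D) > t\zeta$ then $\sigma_S(Y) > \zeta$ for $Y = T(D)$.
Using Part (1) of Lemma 4.3, we can write

$$\begin{aligned}
\Pr\left\{ D \in \Sigma(R, C) : \ \sigma_S(D) > t\zeta \right\}
&= \frac{\left| \left\{ D \in \Sigma(R, C) : \ \sigma_S(D) > t\zeta \right\} \right|}{|\Sigma(R, C)|} \\
&\le \frac{t^{(m-1)(n-1)} \left| \left\{ Y \in \Sigma(R', C') : \ \sigma_S(Y) > \zeta \right\} \right|}{|\Sigma(R, C)|} \\
&= \frac{|\Sigma(R', C')|}{|\Sigma(R, C)|} \, t^{(m-1)(n-1)} \Pr\left\{ Y \in \Sigma(R', C') : \ \sigma_S(Y) > \zeta \right\}.
\end{aligned}$$

Similarly, by Part (2) of Lemma 4.3, if $\sigma_S(D) < t\zeta$ then $\sigma_S(Y) < \zeta + 5|S|$ for
$Y = T(D)$ and

$$\Pr\left\{ D \in \Sigma(R, C) : \ \sigma_S(D) < t\zeta \right\}
\le \frac{|\Sigma(R', C')|}{|\Sigma(R, C)|} \, t^{(m-1)(n-1)} \Pr\left\{ Y \in \Sigma(R', C') : \ \sigma_S(Y) < \zeta + 5|S| \right\}.$$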

It is shown in [D+97] that for sufficiently large margins, the number of contingency 
tables is approximated within a constant factor by the volume of the corresponding 
transportation polytope; see Section 1.2. In particular, estimates of [D+97] imply 
that 



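$$|\Sigma(R', C')| \le \beta_1 \operatorname{vol} \mathcal{P}(R', C') \quad \text{and} \quad |\Sigma(R, C)| \ge \beta_2 \operatorname{vol} \mathcal{P}(R, C)$$

for some absolute constants $\beta_1, \beta_2 > 0$.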
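From (4.2.2), we have

$$r_i \ge t(r'_i - 3n) \ge t r'_i \left( 1 - \frac{3}{m^2 n} \right) \quad \text{for } i = 1, \ldots, m \quad \text{and}$$

$$c_j \ge t(c'_j - 3m) \ge t c'_j \left( 1 - \frac{3}{m n^2} \right) \quad \text{for } j = 1, \ldots, n.$$

It follows then that

$$\operatorname{vol} \mathcal{P}(R, C) \ge \beta_3 \, t^{(m-1)(n-1)} \operatorname{vol} \mathcal{P}(R', C')$$

for some absolute constant $\beta_3 > 0$. The result now follows. □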

Next, we show that the $t$-scaling map $T$ almost scales the typical table provided
the margins $R', C'$ are large enough, that is, $Z' \approx t^{-1}Z$. The idea of the proof is
roughly the following: if margins $(R', C')$ and $(R, C)$ are large enough, then the
corresponding typical tables $Z'$ and $Z$ roughly optimize the functional $\sum_{ij} \ln x_{ij}$ on
the corresponding transportation polytopes and hence the map $X \longmapsto tX$ roughly
maps $Z'$ to $Z$.

(4.5) Lemma. Let $Z = (z_{ij})$ be the typical table with margins $(R, C)$, let $Z' = (z'_{ij})$
be the typical table with margins $(R', C')$ obtained by $t$-scaling and suppose
that

$$z'_{ij} \ge (mn)^4 + 3 \quad \text{for all } i, j.$$

Then

$$\left| \frac{z_{ij}}{t z'_{ij}} - 1 \right| \le \frac{\beta}{mn} \quad \text{for all } i, j$$

and some absolute constant $\beta > 0$.
Proof. First, we prove some useful inequalities for the function

$$g(x) = (x + 1) \ln(x + 1) - x \ln x.$$

We have

$$g(tx) - g(x) = \int_x^{tx} g'(y) \, dy = \int_x^{tx} \ln\left( \frac{y + 1}{y} \right) dy \le \int_x^{tx} \frac{dy}{y} = \ln(tx) - \ln x = \ln t.$$

Also,

$$\begin{aligned}
g(x) &= (x + 1) \ln(x + 1) - (x + 1) \ln x + (x + 1) \ln x - x \ln x \\
&= (x + 1) \ln\left( \frac{x + 1}{x} \right) + \ln x = \ln x + 1 + O\left( \frac{1}{x} \right) \quad \text{for } x \ge 1.
\end{aligned}$$

Finally, we note that

$$g''(x) = -\frac{1}{x(x + 1)}.$$
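The three facts about $g$ stated above can be checked numerically. The following Python sketch is only an illustration under our own choices: the helper names are ours, the sample points are arbitrary, and $g$ is evaluated in the algebraically equivalent form $x \ln(1 + 1/x) + \ln(1 + x)$ to limit rounding error for large $x$.

```python
import math

def g(x):
    # Equal to (x + 1) ln(x + 1) - x ln x, rewritten to avoid cancellation.
    return x * math.log1p(1.0 / x) + math.log1p(x)

def g_second_derivative(x):
    # Closed form g''(x) = -1 / (x (x + 1)).
    return -1.0 / (x * (x + 1.0))

def scaling_bound_holds(x, t, tol=1e-12):
    # Checks g(tx) - g(x) <= ln t for t >= 1, up to a rounding tolerance.
    return g(t * x) - g(x) <= math.log(t) + tol

def remainder(x):
    # g(x) - ln x - 1, which should be O(1/x) for x >= 1.
    return g(x) - math.log(x) - 1.0

def second_difference(x, h=1e-3):
    # Central finite-difference approximation of g''(x).
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)

sample_x = (1.0, 10.0, 1e3, 1e6)
sample_t = (1.0, 2.0, 50.0, 1e4)
checks = [scaling_bound_holds(x, t) for x in sample_x for t in sample_t]
scaled_remainders = [x * remainder(x) for x in sample_x]
print(all(checks))
print(scaled_remainders)
print(second_difference(5.0), g_second_derivative(5.0))
```

The products $x \cdot (g(x) - \ln x - 1)$ staying bounded on the sample points is what the $O(1/x)$ term expresses.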
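Since from (4.2.2) we have

$$r_i \le t r'_i \quad \text{and} \quad c_j \le t c'_j \quad \text{for all } i, j,$$

we have

$$\max_{X \in \mathcal{P}(R, C)} g(X) \le \ln t + \max_{X \in \mathcal{P}(R', C')} g(X). \tag{4.5.1}$$

Let $B$ be the matrix constructed in Section 4.2 and let $W = t(Z' - B) \in \mathcal{P}(R, C)$,
$W = (w_{ij})$. Hence

$$w_{ij} \ge t(mn)^4 \quad \text{for all } i, j.$$

Since

$$g(w_{ij}) = 1 + \ln w_{ij} + O\left( \frac{1}{w_{ij}} \right) \quad \text{and} \quad g(z'_{ij}) = 1 + \ln z'_{ij} + O\left( \frac{1}{z'_{ij}} \right),$$

we have

$$g(W) = g(Z') + \ln t + O\left( \frac{1}{m^6 n^6} \right).$$

From (4.5.1) it follows that

$$g(Z) - g(W) = O\left( \frac{1}{m^6 n^6} \right). \tag{4.5.2}$$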

Next, we are going to exploit the strong concavity of $g$ and use the following
standard inequality:
if $g''(x) \le -\alpha$ for some $\alpha > 0$ and all $a \le x \le b$ then

$$g\left( \frac{a + b}{2} \right) - \frac{1}{2} g(a) - \frac{1}{2} g(b) \ge \frac{\alpha (b - a)^2}{8}.$$
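As an illustration of this midpoint inequality for the specific function $g$, the Python sketch below compares both sides on a few intervals. It relies on one observation of ours: since $|g''(x)| = 1/(x(x+1))$ is decreasing in $x$, on $[a, b]$ one may take $\alpha = 1/(b(b+1))$. The helper names and the sample intervals are hypothetical choices made for the example.

```python
import math

def g(x):
    # Equal to (x + 1) ln(x + 1) - x ln x, rewritten to avoid cancellation.
    return x * math.log1p(1.0 / x) + math.log1p(x)

def midpoint_gap(a, b):
    # Left-hand side: g((a + b)/2) - g(a)/2 - g(b)/2.
    return g(0.5 * (a + b)) - 0.5 * g(a) - 0.5 * g(b)

def lower_bound(a, b):
    # Right-hand side with alpha = 1/(b(b+1)), valid because
    # g''(x) = -1/(x(x+1)) <= -1/(b(b+1)) for all a <= x <= b.
    alpha = 1.0 / (b * (b + 1.0))
    return alpha * (b - a) ** 2 / 8.0

intervals = [(1.0, 2.0), (10.0, 12.0), (100.0, 150.0), (1e4, 1.01e4)]
results = [(midpoint_gap(a, b), lower_bound(a, b)) for a, b in intervals]
for (a, b), (gap, bound) in zip(intervals, results):
    print(a, b, gap, bound)
```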



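If for some $i, j$ we have $|w_{ij} - z_{ij}| > (mn)^{-1} w_{ij}$, then in view of (4.5.2), for some
point $U$ on the interval connecting $W$ and $Z$ and all sufficiently large $mn$, we will
have

$$g(U) > g(Z),$$

which is a contradiction. Thus

$$\left| \frac{z_{ij}}{w_{ij}} - 1 \right| \le \frac{1}{mn} \quad \text{for all } i, j$$

and all sufficiently large $mn$. Since

$$\left| \frac{w_{ij}}{t z'_{ij}} - 1 \right| \le \frac{3}{z'_{ij}} \le \frac{3}{(mn)^4},$$

the proof follows. □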
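The statement of Lemma 4.5 can be explored numerically on a small example. The Python sketch below rests on several assumptions of ours and is not the construction of Section 4.2: it takes margins that scale exactly, $R = tR'$ and $C = tC'$, and it computes the typical table from the stationarity condition $g'(z_{ij}) = \ln(1 + 1/z_{ij}) = \lambda_i + \mu_j$ of the concave maximization problem, alternately solving for each $\lambda_i$ and each $\mu_j$ by bisection. The function name `typical_table`, the alternating scheme and the sample margins are hypothetical choices made for illustration.

```python
import math

def typical_table(R, C, sweeps=50):
    """Approximate the maximizer of sum g(x_ij) over matrices with row sums R
    and column sums C, using z_ij = 1 / expm1(lam_i + mu_j)."""
    m, n = len(R), len(C)
    lam, mu = [0.0] * m, [0.0] * n

    def total(s, others):
        # Sum of 1/(exp(s + o) - 1); decreasing in s for s > -min(others).
        return sum(1.0 / math.expm1(s + o) for o in others)

    def solve(target, others):
        # Bisection for the unique s > -min(others) with total(s) = target.
        lo = -min(others)
        hi = lo + 1.0
        while total(hi, others) > target:
            hi = lo + 2.0 * (hi - lo)
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if not lo < mid < hi:
                break  # interval has shrunk to floating-point resolution
            if total(mid, others) > target:
                lo = mid
            else:
                hi = mid
        return hi

    for _ in range(sweeps):
        for i in range(m):
            lam[i] = solve(R[i], mu)   # match the i-th row sum
        for j in range(n):
            mu[j] = solve(C[j], lam)   # match the j-th column sum
    return [[1.0 / math.expm1(lam[i] + mu[j]) for j in range(n)] for i in range(m)]

# Hypothetical margins with exact scaling R = t R', C = t C'.
R_small, C_small, t = [3000, 5000], [2000, 2000, 4000], 50
Z_small = typical_table(R_small, C_small)
Z_big = typical_table([t * r for r in R_small], [t * c for c in C_small])
ratios = [[Z_big[i][j] / (t * Z_small[i][j]) for j in range(3)] for i in range(2)]
print(ratios)
```

In this toy setting the printed ratios $z_{ij}/(t z'_{ij})$ can be compared with 1; the example does not reproduce the shift by the matrix $B$ or the hypotheses on the size of the entries used in the lemma.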



(4.6) Proof of Theorem 1.5. Without loss of generality we assume that $N \ge (mn)^7$
since the case of a polynomially bounded $N$ is handled in Proposition 3.5.
Let us choose

$$t = \left\lfloor \frac{N}{(mn)^6} \right\rfloor$$

and consider the $t$-scaling map $T : \Sigma(R, C) \longrightarrow \Sigma(R', C')$. Since margins $(R, C)$
are $\delta$-smooth, we have

$$(mn)^6 \le N' \le (mn)^7 \quad \text{and} \quad r'_i, \ c'_j \ge (mn)^4 \quad \text{for all } i, j$$

and all sufficiently large $n \ge m$.

Let us choose $0 < \delta_1 < \delta$. It follows by (4.2.2) that the margins $(R', C')$ are
$\delta_1$-smooth for all sufficiently large $n \ge m$. Let $Z'$ be the typical table of $(R', C')$,
$Z' = (z'_{ij})$. By Corollary 2.5,

$$z'_{ij} \ge (\delta_1)^2 \frac{N'}{mn}.$$

Therefore, for all sufficiently large $m + n$ we have

$$z'_{ij} \ge (mn)^4 + 3.$$

The result now follows by Lemmas 4.4, 4.5, and Proposition 3.5 applied to $(R', C')$.

□